'Assume absolute stereo flag', '|r|' suffix for cd_smiles

26-07-2004 09:21:18

What does the 'Assume absolute stereo flag' mean in the Create table dialog?


What does it have to do with the cd_smiles ' |r|' suffix?


Why have other customers complained about the ' |r|' suffix?

ChemAxon 9c0afc9aaf

26-07-2004 09:31:58

1. Chiral flag issue:





The 'Assume absolute stereo flag' option means that all query and target structures are treated as absolute stereo during search.


This table option can be changed at any time in the regeneration dialog; you do not have to actually regenerate the table.





In some MDL files the chiral flag is accidentally missing, although the structures should be treated as absolute stereo.


Since JChem 2.3 there is an import option to set the chiral flag during import.


In this case no "|r|" suffixes appear of course (but other features may still be present as a suffix).





MDL files have a so-called "chiral flag".


If this flag is missing (not set), the structure is relative stereo.


This means that, in terms of chiral centers, the structure is either exactly as drawn _or_ its mirror image (every chiral center in the opposite configuration).
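To illustrate what "as drawn or its mirror image" means, here is a minimal Python sketch that inverts every tetrahedral center in a SMILES string by swapping the "@" and "@@" marks. This is only an illustration, not how JChem works internally. The function names are made up for this example, and the sketch assumes plain "@"/"@@" marks only (no extended chirality classes such as "@TH1").

```python
import re

def mirror_image(smiles):
    """Invert every tetrahedral center by swapping '@' and '@@'.

    Assumption: only plain '@' / '@@' marks are present. Double-bond
    marks ('/' and '\\') are left untouched, since mirroring does not
    change cis/trans geometry.
    """
    return re.sub(r"@@?", lambda m: "@" if m.group() == "@@" else "@@", smiles)

def relative_stereo_candidates(smiles):
    """Return the drawn structure and its mirror image as a set of strings."""
    return {smiles, mirror_image(smiles)}

# A relative stereo structure stands for either of these two:
print(relative_stereo_candidates("N[C@@H](C)C(=O)O"))
```

Note that the strings are compared as text here, not as canonical structures, so this is useful only for seeing the idea.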





This feature is not supported by SMILES, so JChem uses a SMILES extension to represent these (see below).





If the "Assume absolute stereo flag" option is checked, the search ignores the "|r|" suffix and treats the structure as absolute stereo.
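The rule above can be modelled with a short Python sketch. It is a simplified model of the behaviour described here, not JChem's search code. The function names are hypothetical, and the sketch assumes that the extension follows the first space, is enclosed in "|" characters, and lists its features separated by commas.

```python
def has_relative_flag(cd_smiles):
    """True if the extension part (after the first space) contains the 'r' feature."""
    _, _, extension = cd_smiles.partition(" ")
    # Assumption: features are comma-separated inside the '|...|' block.
    features = extension.strip().strip("|").split(",")
    return "r" in features

def treated_as(cd_smiles, assume_absolute):
    """Model of how a structure is interpreted during search."""
    if assume_absolute:
        # Table option checked: the 'r' flag is ignored.
        return "absolute"
    return "relative" if has_relative_flag(cd_smiles) else "absolute"

print(treated_as("C[C@H](N)O |r|", assume_absolute=False))
print(treated_as("C[C@H](N)O |r|", assume_absolute=True))
```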





2. "cd_smiles" suffix issue (in general)





JChem uses the "cd_smiles" column during search.


The structure may contain features that are not supported by SMILES.


In this case we ensure the correctness of the search by adding extra information to the SMILES:





http://www.chemaxon.com/marvin/doc/user/cxsmiles-doc.html





Some customers (try to) use "cd_smiles" for displaying the structure in SMILES format with a non-ChemAxon tool that cannot recognize these extensions.





In this case


- the SMILES can simply be truncated after the first space


or


- a molecule can be created from "cd_structure", and then converted to SMILES with Molecule.toFormat("smiles")





The latter method produces SMILES from the original (not standardized) structure.


Naturally, with both methods you lose the part of the information that cannot be represented in SMILES.
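The first option (truncating after the first space) needs no ChemAxon code at all. A minimal Python sketch, with a hypothetical function name, could look like this. A value without a suffix is returned unchanged, and a NULL (None) value is passed through.

```python
def plain_smiles(cd_smiles):
    """Drop the extension suffix by keeping only the text before the first space."""
    if cd_smiles is None:
        return None
    return cd_smiles.split(" ", 1)[0]

print(plain_smiles("C[C@H](N)O |r|"))
```

The second option requires the ChemAxon API (Molecule.toFormat("smiles")) and is therefore not shown here.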





Moreover, a structure occasionally contains rare features that cannot be represented even in our SMILES extension.


In this case the "cd_smiles" value is NULL and JChem uses the "cd_structure" column for the search process, which is slower.


We are gradually incorporating every feature that influences the search process into our SMILES extension; until then you may expect NULL values in rare cases.
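If you process the table yourself, your code should be prepared for these NULL values. The following Python sketch shows the fallback idea only. The row is a plain dictionary keyed by column name and the function name is hypothetical; this is not JChem code.

```python
def search_input(row):
    """Pick the column to work from: 'cd_smiles' if present, else 'cd_structure'."""
    if row.get("cd_smiles") is not None:
        return ("cd_smiles", row["cd_smiles"])
    # NULL cd_smiles: fall back to the original structure (slower path).
    return ("cd_structure", row["cd_structure"])

print(search_input({"cd_smiles": None, "cd_structure": "<original structure data>"}))
```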





In general, we recommend using the "cd_structure" column for display.


This column always stores the original structures in the original format without any information loss (e.g. Marvin Documents may even contain graphical objects in addition to the structure).





We recommend Marvin Applets and Marvin Beans for displaying the structures, since they support every format that can be found in the database table.





The "cd_smiles" column is intended mainly for internal use, to speed up the search process.